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Abstract 

We develop an efficient numerical scheme to solve accurately the set of nonlinear integral equa- 
tions derived previously in (Saichev and Sornette, 2007), which describes the distribution of inter- 
event times in the framework of a general model of earthquake clustering with long memory. 
Detailed comparisons between the linear and nonlinear versions of the theory and direct synthetic 
catalogs show that the nonlinear theory provides an excellent fit to the synthetic catalogs, while 
there are significant biases resulting from the use of the linear approximation. We then address 
the suggestions proposed by some authors to use the empirical distribution of inter-event times 
to obtain a better determination of the so-called clustering parameter. Our theory and tests 
against synthetic and empirical catalogs find a rather dramatic lack of power for the distribution 
of inter-event times to distinguish between quite different sets of parameters, casting doubt on the 
usefulness of this statistics for the specific purpose of identifying the clustering parameter. 
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I. INTRODUCTION 



Most complex systems of interest in the natural and social sciences exhibit intermittent 
bursts of activity interspersed within long times of reduced activity. A simple metric to 
characterize this property consists in the distribution of recurrence (also called "waiting" or 
"inter-event") times between (suitably defined) events. Recently, the literature has under- 
gone itself a burst of publication activity on this topic, motivated by the idea that distribu- 
tions of recurrence times may be one of the most important complexity measures for both 
random fields and nonlinear dynamical systems [13]. The applications include recurrence 
time and anomalous transport [39], waiting times between earthquakes [3, 4, 28, 33, 34] and 
rock fractures [12], time intervals between consecutive e-mails [2] and between web browsing, 
library visits and stock trading [38]. 

Much of the recent interest of the statistical physics community focused on applying 
scaling techniques, which are common tools in the study of critical phenomena, to the 
statistics of inter-earthquake recurrence times or waiting times [1, 3-9, 11, 27]. Many of 
the claims made in these recent articles on recurrence statistics have either been challenged, 
refuted or explained by previously known facts about earthquake statistics [14, 25, 26, 29, 33, 
34]. In particular, two of us [33, 34] have developed a general theory of the statistics of inter- 
event times in the framework of the general class of self-excited Hawkes conditional Poisson 
processes [16-18] adapted to modeling seismicity. The corresponding model is known as 
the epidemic-type aftershock sequence (ETAS) model, in which any earthquake may trigger 
other earthquakes, which in turn may trigger more, and so on. Introduced in slightly different 
forms by Kagan and Knopoff [24] and Ogata [30], the model describes statistically the 
spatio-temporal clustering of seismicity. Using three well-known statistical laws of statistical 
seismicity (the Gutenberg- Richter, the Omori law and the productivity law), the empirical 
observations on the distribution of earthquake recurrence times can be explained within 
this model without invoking additional mechanisms other than the well-known fact that 
earthquakes can trigger other earthquakes [33, 34]. 

A recent development is the proposition that inter-event time distributions may provide 
a new and more reliable way to measure of the so-called background earthquake activity 
[14, 15]. This question arises as follows: if earthquakes trigger other earthquakes, how 
much of the observed seismicity is due to past seismicity (endogenous origin) and how 
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much is resulting from an "external" driving source (exogenous origin) often referred to 
as "background" seismicity thought to reflect the driving tectonic forces at large scales. 
This question obviously generalizes to any system in which future events may be in part 
triggered by past events, such as in commercial sales [35] and web browsing activity [10, 38]. 
Within the ETAS framework, the fraction of events in a given catalog which have been 
triggered by previous events can be shown [21] to be nothing but the so-called branching 
ratio n, defined mathematically as the average number of first-generation events triggered by 
a given preceding event [20]. Reciprocally, the fraction of background events is equal to 1 — n 
(note that these models assume that the triggering branching-like processes are sub-critical: 
n < 1). The degree to which the parameter n can be retrieved from the distribution of 
inter-event times relies on departure from universality pointed out by Hainzl et al.[14] and 
two of us [33, 34]. In this respect, the ETAS model provides an excellent training ground. 
Using synthetic catalogs generated with the ETAS model, Hainzl et al.[14] found that the 
estimation of n using the distribution of inter-event times is better than from the application 
of a standard declustering procedure [32]. 

More progress can be achieved by a better understanding of the sensitivity of the distri- 
bution of inter-event times to the branching ratio n. In principle, the theoretical framework 
based on the technique of probability generating functions developed in Ref. [33, 34] pro- 
vides an ideal approach to this problem. However, this previous effort was limited on two 
accounts. First, while Saichev and Sornette derived the full exact nonlinear integral equa- 
tions of the problem, they ended solving their linearized versions in order to derive the 
distribution of inter-event times. The present paper keeps the full nonlinear integral equa- 
tions and shows that the linear simplification leads to systematic biases in the estimations 
of the key parameters of the ETAS model, and in particular of the branching ratio n which 
has been the focus of recent interest in the seismological community [14, 15]. Secondly, 
only preliminary sensitivity analysis was performed with respect to n. The present paper 
presents a detailed treatment of the full exact nonlinear equations providing the distribution 
of inter-event times for the ETAS model and discusses how well n can be constrained. 

The organization of the presentation is as follows. Section 2 describes the theoretical 
framework developed by Saichev and Sornette [33, 34] and summarizes their main results, 
essentially based on a linear approximation to the full nonlinear equations that they derived. 
Section 3 focuses on these nonlinear equations and presents the numerical scheme that has 
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been used to solve them. Detailed comparisons between the linear and nonlinear versions of 
the theory and direct ETAS simulated catalogs are presented. With improved adaptive mesh 
grids, it is shown that the nonlinear theory provides an excellent fit to the synthetic ETAS 
catalogs, while there are significant biases resulting from the use of the linear approximation. 
Section 3 concludes by a synthetic test demonstrating the possibility to use the nonlinear 
theory to invert for two of the unknown parameters, if constraints exist on the other three 
parameters of the model. Section 4 applies these results to the empirical data set treated by 
Corral [3]. We find a rather dramatic lack of power for the distribution of inter-event times 
to distinguish between quite different sets of parameters, casting doubt on the usefulness of 
this statistics for the specific purpose of identifying the clustering parameter n. 

II. SUMMARY OF RESULTS OBTAINED BY SAICHEV AND SORNETTE [33, 
34] 

A. The ETAS model 

The ETAS model views the flow of future seismicity as being triggered by past seismicity 
and by a few background events. Each earthquake is assumed to have the potential to trigger 
future earthquakes according to three laws capturing the nature of seismicity viewed as a 
marked point-process. We restrict this study to the temporal domain only, summing over the 
whole spatial domain of interest. First, the magnitude of any earthquake, regardless of time, 
space or magnitude of the mother shock, is drawn randomly from the exponential Gutenberg- 
Richter (GR) law. Its complementary cumulative probability distribution is expressed as 

Q(m) = i(T 6 ( m - m °). (1) 

where the constant exponent b is typically close to one, and the cut-off m serves to normalize 
the pdf. We do not consider the influence of an upper cut-off m max , usually estimated in 
the range 8 — 9.5 [22, 31], because its impact is quite weak in the calculations. 

Second, the model assumes that direct aftershocks are distributed in time according to 
the modified "direct" Omori law (see Ref.[37] and references therein). Denoting the usual 
Omori law exponent by p = 1 + 6 and assuming 9 > 0, the normalized pdf of the Omori law 
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can be written as 

m - ■ m 

where t is the time since the earthquake and c is a regularizing constant preventing the 
divergence of the rate at small times. 

Third, the number of direct aftershocks of an event of magnitude m is assumed to follow 
the productivity law: 

p(m) = k ■ 10 a(m - mo) , m < m , (3) 

where k and a are constants. Note that the productivity law (3) is zero below the cut-off 
mo, i.e., earthquakes smaller than m do not trigger other earthquakes. The existence of 
the small-magnitude cut-off m acts as a "ultra-violet" cut-off which is necessary to ensure 
the convergence of the models of triggered seismicity for a < b. 

These laws are combined with the fundamental defining ETAS equation 

A(t) = u + ]T p(m,)$(t - U) , (4) 

i\t t <t 

giving the conditional Poisson intensity A(i) for the occurrence of the next event, conditioned 
on the history of past events 'Hit) = {...(ti,m,i), (ti,mi)}. Here, U < t (respectively to*) 
is the time of occurrence (respectively magnitude) of the i-th earthquake counted from the 
present time t. The term uj is the background contribution assumed to embody the effect 
of the large scale tectonic driving. Taking the expectation of (4) yields the average seismic 
rate 

E[A(f)] = , (5) 

where n is the key parameter of the ETAS model defined as the number of direct aftershocks 
per earthquake, averaged over all magnitudes: 

n = I \dQ(m)/dm\p(m)dm = — -. (6) 

Jm (b — a) 

As recalled in the introduction, the fraction of events in a given catalog which have been 
triggered by previous events can be shown [21] to be exactly given by this "branching ratio" 
n. 
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B. Mathematical formulation for the determination of the distribution of inter- 
event times 



Saichev and Sornette [33, 34] used the formalism of probability generating functions to 
calculate from first principles for the ETAS model the distribution H(t, m) of waiting times 
between events of magnitudes larger than or equal to m in a region of seismicity rate X(m). 
In agreement with previous works [3, 4, 28, 33, 34], we express H(t, m) as 

H(r,m) ~ A(m)/(A(m)r) , (7) 

so that the dependence on the local seismicity rate is absorbed in the variable A(m) while 
the more general functional form is captured by the function f(x). Saichev and Sornette 
first used the general relation 

1 d 2 P(T,m) 

H{r,m)= y 2 ' , 8 

\{m) dr z 

where P(r, m) is the probability of absence of events of magnitude larger than or equal to 
m within the interval [t,t + r]. The following expression was obtained 

(OO T \ 

- J N(t, r, m)dt - J N_(t', m)dr' j , (9) 
o o / 

with the auxiliary functions N_(t, m) and N(t, r, m) given by the following nonlinear integral 

implicit equations 

AL(r,m) = 1 - #[$(t) ®AL(r,m)] + Q{m)^[Q' lh {m)^{T) ® iV_(r,m)] , (10) 
N(t,r,m) = 1 ® N(t, r, m) + $(t + r) ® iV_(r,m)] , (11) 

where 

7 = - (12) 

is assumed larger than 1 (but probably close to 1). iV_(r, m) and N(t,T,m) have the 
following probabilistic interpretation: 

• iV_ (r, m) is the probability that either some background earthquake occurs in the time 
window [t, t + r] which has a magnitude larger than m, or, if its magnitude is smaller 
than m, given that it occurred at time t, that it will generate at least one aftershock (or 
their subsequent daughters) of magnitude larger than m within the interval [t,t + r]. 
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• Analogously, N(t,r,m) is the probability that some background earthquake of mag- 
nitude larger than to, occurring at instant t — 0, will generate at least one aftershock 
of magnitude larger than to within the time interval [t, t + r]. 

The symbol <E> stands for the convolution operation over the variable r in (10) and in the 
second part of the argument of the function ^ in (11), and over the variable t in the first 
part of the argument of the function \l/ in (11). The function is expressed through the 
incomplete Gamma-function: 

=7(^) 7 r(- 7 ,Kz). . (13) 
For convenience, we use its expansion in powers of z: 

V(z) = l-nz + fiz 1 -rjz 2 + ... (14) 

where f3 and 77 are two numerical constants which can be expressed in terms of k and 7. 

It is clear that the distribution H(t, to) of inter-event times r depends on the magnitude 
cut-off m of events used to construct this distribution. A natural value for this cut-off 
is the magnitude m d of so-called completeness of the considered catalog, above which all 
earthquakes are thought to be recorded by the existing seismic network. This detection 
threshold rrid has evolved over time together with the technology and density of the seismic 
networks. In our comparison with Corral's analysis presented below, we use the values m d 
reported by him for each corresponding catalog. 

The goal of this paper sequel is to calculate the full solution of (10,11) leading to the 
expression of H{r, to) given by (8) and to compare this prediction to the data analysis 
performed by Corral [3] in order to bracket the three key parameters of the ETAS model, 
9, 7 and n. The first one describes the direct Omori law. The second one, given the well- 
known 6-value, provides a new estimate for the productivity exponent a. The third one n 
is directly associated with the fundamental question in seismicity of how much clustering 
occurs in recorded catalogs, as discussed in the introduction. 

C. Analytical solution using the linear approximation 

The determination of the form of f(x) defined in (7) can be analytically resolved only 
by reducing equations (10) and (11) to their linear approximations, i.e., when only the two 
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first summands of the expansion (14) are considered: 

m(z) = l-nz. (15) 

Using this approximation, Saichev and Sornette [33, 34] introduces for convenience the 
auxiliary function g(r,m d ) defined by 

g (r,m d ) = l-^^, N.{m d )= Km (N_(r,m d )). (16) 

With the linear approximation (15), we have 

iV_(m,) = ^^, S = n[l-Q 1 - 1 ^(m d )] (17) 
l — o 

With (15), the main remaining problem of solving equation (11) can be done by representing 
the first integral in (9) via the second one, so that one just needs to determine the function 
g(r,md), from which one obtains 

P(r) = exp [-^r - A J g(r', m d )dr^ , (18) 

where A = — y^. Here, we have dropped the explicit dependence on the magnitude, 
except in the function g which is written as dependent on the threshold magnitude m d . 
Then, using a quasi-static approximation for g(r,m d ), Saichev and Sornette obtained 

* 1 JTi(r) ■ (19) 

where 

/'OO (fi 

*(t>)dt> = j—^ . (20) 

Using the dimensionless variable x = At, where A = uQ(m d )/(l — 5) and uj is the seismic 
rate of spontaneous seismic sources, expression (8) with (7) yields 

/w = ^V|_mJ > ^ imJ=P(T) = P g) > (21) 

leading finally to the dimensionless distribution of inter-event times: 

f(x) = (6v(l - 5)e~ 6 {x + e)*-y (a, m d , 9) + [rj + ug(x, m d , 9)} 2 ) <p(x, m d ), (22) 
Here, e = Ac, v = (1 - n)A and g(x, m d , 9) = g(r, m d ) = g(x/ A, m d ). 
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Fig. 1 reproduces the comparison obtained in Ref.[33, 34] between the function f(x) 
given by (22) and Corral's phenomenological functional fit [3]. Ref.[33, 34] found that 
expression (22) can fit rather well the empirical distributions of inter-event times, so as to 
even improve on Corral's fit for short time scales, with a ~ 0.7 — 0.9 and n tv 0.8 — 1.0. 
These rather large intervals reflect a corresponding insensitivity of the quantitative shape of 
f(x) with respect to a and n. An analysis of the impact of the first nonlinear term fiz 1 in 
the expansion (14) of in the nonlinear equations (10) and (11) led Saichev and Sornette 
to expect "weak departures from the results obtained with the linear approximation" . They 
added "It thus appears that the statistics of recurrence times is not sensitive enough to 
reveal the importance of these nonlinear corrections which describe the effect of cascades of 
generations of aftershocks." It turns out that this statement was premature, as shown by 
our full treatment of the nonlinear equations. In particular, we identify significant biases 
in the estimation of the parameter n when using the linear approximation. The reason lies 
in the fact that, for a close to 1 (specifically 1 < 7 < 1.2) as found in Ref.[33, 34], the 
first nonlinear correction fiz 1 is only weakly nonlinear. We show below that the inclusion of 
the next term ~ z 2 changes somewhat the conclusions. In contrast, the higher-order terms 
beyond z 2 do not change the conclusions. 



III. ANALYSIS OF THE FULL NONLINEAR EQUATIONS (10) AND (11) 
A. Preparation of the equations and notations 

Using the first four summands of the expansion (14), equations (10,11) can be written in 
the following form: 

N-(r,m d ) = ^^ + 5^(r)^N4r,m d ) + aN4m d )Mr)^N_(r,m d )] 2 , (23) 

N(t, r, m d ) = n[$(t) <g> N(t, r, m d ) + $(t + r) <g> N_ (r, m d )\- 

-P[$(t) <g> N(t, r, m d ) + $(t + r) <g) AL(r, m d )f+ (24) 
+77[$(i) ®N(t,T, m d ) + $(t + r) ® iV_(r, m d )] 2 , 

where 

a = r ] [l-Q 1 - 2 ^(m d )]. (25) 
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In the present case, N-(m d ) defined in (16) is not identical to Q{m d ) /(1 — S) as in the linear 
approximation. Instead, N_{m d ) is the root of a simple quadratic equation. However, the 
difference can be small: for instance, for m d — m = 2, 9 = 0.03,7 — 1-2, n = 0.9, we have 
Q(m d )/(1 - 5) ~ 1.93 * 10~ 2 compared with N_(m d ) ~ 1.90 * 10~ 2 . 

Defining the dimensionless variables x — Xt, y — At with A = ujN_{m d ) and the functions 

M_(j/) = iV_(|,m d ), M(z,y) = 7vg,^,m d ), (26) 

$ e (x) = A$ . (27) 

the equations (23) and (24) become 

M-(y) = ^y^- + 6$ e (y) ® M_(y) + aA[$ e (y) ® M_(y)] 2 , (28) 

M(x,y) =n[$ e (x) ® M(x,y) + <$> t {x + y) <g> M_(y)]- 

-/3[$ e (x) ® M(x, y) + $ e (x + y) ® M_(y)f+ (29) 
+^[$ e (a;) <g> M (x, y) + <$> e {x + y)® M_ (y)] 2 . 
For the numerical calculations, we transform equation (28) into an equation for the new 
function g{y) — 1 — M_(y)/A: 

«?(y) = 1 - 5 - + 5 [ae (y) + $ e (y) ® y(y)] - 

-aA[a e (y) + <I> e (y)®y(y)-l] 2 , a e (y) = J $ e (x')dx' . 

X 

Solving for g(y) instead of M_(y) is more efficient numerically because g(y) is a monotoni- 
cally decreasing function unlike M_(y). This ensures a faster numerical convergence and a 
weaker sensitivity to the finite mesh size of the discretization scheme. Equations (29) and 
(30) form the basis for our numerical calculations. 



B. Numerical solution 

The first step is to solve (30) for the function g(y) that we reformulate as equation (34) 
given in the Appendix A. we use the method of successive approximations to obtain the value 
of the function g(yi) on a regular grid y { = y + i-dy with a small mesh dy. The performance 
of this method is discussed in Appendix A in the context of the linear approximation. 
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Fig.2 shows the difference between the quasi-static approximation (19) and the solution 
of the nonlinear equation (30). One can observe that the nonlinear solution lies under 
the quasi-static approximation, i.e., it gives a correction which is in the opposite direction 
compared with the linear solution (see fig. 3 ). 

The next step is to determine the function M(x,y), obtained as the solution of equa- 
tion (29). Note that the convolution operation involving M_(y) can now be expressed in 
terms of the known function g(y): 

<5> e (x + y) <g) M_{y) = a e (t) - a e (t + r) - <5> t (x + y) ® g{y) . (31) 

In the nonlinear case, the function M(x, y) cannot be represented analytically through the 
function M_(y) as in the linear case. This means that we have to calculate M(x, y), a func- 
tion of two variables (which significantly slows down the calculation speed). Equation (31) 
implies that the functions § t (x + y) and g(y) should be estimated on the same grid points 
(x + y)k and y\. Therefore, the mesh sizes of x and of y should be identical: dx = dy. 
In order to determine the probability P{r) = ip(y, ma), we must also estimate the integral 

oo 

jM(x,y)dy. (32) 
o 

This requires to span a large set of y values in order to approximate the theoretical one 
[0; +oo] which, together with the condition dx = dy, make the problem very demanding in 
memory capacity. For example, for x G [0; 1] and y G [0; 1] with dy = dx ~ 10~ 4 , M(x,y) 
is a matrix with 10 8 elements. To alleviate this burden on memory capacity, we divide the 
y-interval into smaller intervals y n ,n = 1,2, ...,7V and we determine the matrix M(x,y n ) 
consecutively for each of these sub-intervals. Having determined the probability function 
ip(y, rrid) on each such small intervals y n , we use a simple smoothing polynomial interpolating 
scheme in order to prevent jumps in its second order derivative. 

An example of the resulting probability P(y) defined in (9) is shown in fig. 4, which identi- 
fies a significant difference between the quasi-static approximation presented in Ref. [33, 34] 
and the nonlinear solution. 

C. Comparison between the linear and nonlinear versions of the theory and direct 
ETAS simulated catalogs 

We present a comparison for the pdf of inter-event times obtained with 
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(i) the linear analytic quasi-static approximation, 

(ii) the numerical solution of the nonlinear equations (29) and (30) and 

(iii) "exact" synthetic catalogs. 

The two former solutions are obtained by taking the second order derivative of the functions 
P(y) ((f(y,m)) shown in fig. 4, according to (21). The synthetic catalog was obtained using 
the method described in Appendix B. The ETAS parameters used here are: n = 0.9, 9 = 
0.05, 7 = 1.1, m d — m = 0. 

Fig. 5 shows the three pdf's obtained by the three methods. For the "exact" pdf re- 
constructed from a synthetic ETAS catalog, we show both the histogram and a fit using 
a function constructed as the ratio between a polynomial function of order 5 divided by 
another polynomial function of order 4. These functions are expressed in terms of the loga- 
rithm of the dimensionless inter-event time. This fit has no pretence of rigor, it only provides 
a useful guide to the eye. 

Fig. 5 shows that the linear theory is significantly in error while the nonlinear theory 
provides an excellent agreement with the "exact" pdf for values of the dimensionless time 
interval x > 4 • 10~ 2 . For smaller x's, the difference is due to numerical errors in the 
treatment of equations (29) and (30), which can be removed by using an adaptive mesh 
size, as discussed shortly below. The discrepancy between the "exact" pdf and the one 
obtained using the linear approximation implies that a fit of empirical pdf's using the linear 
theory will likely provide spurious values for the significant parameters n and 7. Indeed, a 
good fit of the "exact" synthetic pdf shown in fig. 5 is obtained with the linear theory using 
effective parameters n e s = 0.86 and j c s = 1.28, showing here a systematic bias of 5% in the 
determination of n and over 16% in the determination of 7 (and therefore of the productivity 
a). 

Let us now return to the discrepancy between the nonlinear theory and the "exact" pdf 
observed for x < 4 • 10~ 2 in fig. 5. Two possible factors need to be discussed: 

1. impact of terms of order higher than z 2 in the expansion (14) of the function ^f(z) 
given by (13); 

2. lack of convergence of the numerical scheme to solve equations (29) and (30), due to 
a too large mesh size. 
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Fig. 6 rules out the first explanation, since the solution of the nonlinear equations obtained 
by using the full expression (13) in the calculation of equation (29) is undistinguishable 
from the solution obtained with the expansion (14). This check and other tests confirm that 
there is no need to complicate the computations by adding the calculation of the incomplete 
gamma function. This is important when using our theory for inverting the parameters from 
fits to empirical data, for instance. 

With respect to the second factor, we improve the numerical precision by varying the 
mesh size dx so that dx/x remains approximately equal to 1CT 4 for x < 0.1 while dx is fixed 
at 10~ 4 for x > 0.1. Thus, for x ~ 0.001, we have chosen dx ~ 10 7 , which is the limit that 
we have been able handle due to limited numerical precision of the computer. Fig 7 shows 
for the example n = 0.86,0 = 0.05,7 = 1-11 that the problem previously noted in fig. 5 
disappears: there is a good agreement between the "exact" pdf obtained from the synthetic 
ETAS catalog and the nonlinear theory down to x — 0.001. 

D. Test of the inversion of the parameters n and 7 using the nonlinear theory 
from a synthetic ETAS catalog 

Consider a synthetic ETAS catalog of inter-event times for some fixed values of the 
parameters n cat , 9 cat , 7 cat and {m d — m } cat . In this example, we take specifically n cat = 
0.86, 9 cat = 0.05, 7 ca t = 1.11 and {m d — m } C at = 0. Figure 8 shows the "exact" pdf of inter- 
event times (crosses), which mimics a real-life situation with statistical fluctuations. In a 
real-life experiment, one would like to use the nonlinear theory to invert for the unknown 
parameters n,9^,m d — m . In this goal, using the nonlinear theory, we calculate the 
predicted pdf /nl(^) of inter-event times for fixed values of the parameters n, 9, 7 and m d — 
m . For a given set of these four parameters, we construct the mean-square error of the 
logarithm of the pdf over the N inter-event times of the catalog: 

N 

LLS(n,9,-/,m d - m ) = J2( [n hL(x i ) -\nf cat (xi)) 2 , (33) 

i=i 

where /nl(^«) is the predicted pdf at the dimensionless inter-event time Xi given by the 
nonlinear theory and f ca ,t(xi) is the corresponding empirical pdf (in the synthetic catalog). 
LLS(n, 9, 7, m d — m ) quantifies how well the nonlinear prediction for the pdf of inter-event 
times can describe the (synthetic) data. The unknown parameters n, 9, 7, m d — m are then 
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obtained by finding the quadruplets which makes LLS(n, 9, 7, — m ) minimum. 

In practice, given the computational cost of the numerical solution of the nonlinear the- 
ory, we have found unpractical to explore systematically the four dimensional parameter 
space (with super-computer resources, this is not excluded but the next section removes the 
motivation to explore further this option as we will see). For the sake of demonstration, we 
assume that we already know 9 = 9 cat = 0.05 and — m = {rrid — m } C at = 0. We are 
then left with searching for the remaining parameters n and 7. For this, we form a grid in 
the (n, 7) plane over which we find the minimum of LLS(n, 7; 9 cat , 9 cat ). The corresponding 
inverted values are n hcst fit = 0.94 ± 0.02 and 7b es t fit = 1-13 ± 0.02. The recovery of 7 (and 
therefore of the productivity exponent a) is good, while there is 9% error on n. Figure 8 
shows that this best pdf fits well the "exact" pdf obtained from the ETAS catalog and is 
not far from the pdf predicted by the nonlinear theory with the true parameters. 

IV. ON THE LACK OF POWER OF THE PDF OF INTER-EVENT TIMES TO 
INVERT FOR THE CLUSTERING PARAMETER n AND OTHER PARAMETERS 

The title of this section is motivated by fig.9 comparing the pdf of inter-event times in 
synthetic ETAS catalogs with three different sets of parameters and the pdf 's obtained by 
Corral in different regions of the world [3]. 

First, one can observe that the three triplets (n = 0.96, 9 = 0.05,7 — !•!)> ( n — 0.6,0 = 
0.05,7 — 1-2) and (n = 0.5,0 = 0.15,7 = 1.1) give almost the same pdf's over the whole 
range of dimensionless inter-event times 10~ 4 < x < 15. For x > 0.1, the data collapse is 
almost perfect, while the scatter is larger for the smaller x values. This is bad news for the 
determination of the clustering parameter n in particular, since relatively small changes in 
the Omori law parameter 9 and in the productivity law parameter 7 can compensate for a 
quite significant change in the branching ratio n. This suggests that previous claims on the 
use of the pdf of inter-event times to extract efficiently the clustering parameter have been 
over-optimistic [14, 15]. 

Second, fig.9 shows that the three chosen triplets of parameters are basically equally good 
at fitting Corral's data sets [3]. We note that, again for x > 0.1, all empirical data and ETAS 
simulations present an almost perfect collapse on a quasi- universal curve. Larger scatter 
characterize smaller x values, which is the region to scrutinize in the hope of extracting 
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some useful constraints on the parameter values. 

Actually, the situation is even more involved since, in addition to the parameters n, 9 
and 7, a genuine inversion needs also to determine m d — m (whose impact is significant 
as shown in Ref. [36]) as well as the regularizing constant c in the Omori law (2). Fig. 10 
presents the pdf's calculated with the nonlinear theory for different sets of four of these 
parameters (n, 7, m d — mo, c) with a fixed 9 = 0.05, together with Corral's data. For the 
pdf's obtained from the nonlinear theory, we used all combinations between the three values 
n = 0.64,0.8,0.96, the three values 7 = 1.01, 1.07, 1.13, two values m d — m = 0.1, 1 and 
two values e(c) = Ac = 10~ 4 , 10~ 5 , corresponding to a total of 36 combinations. One can 
observe roughly two clusters among these 36 theoretical curves. All curves with 7 = 1.01 
belong to the lower cluster, which is clearly not fitting the data. The upper cluster, which is 
in better agreement with the data, corresponds to the larger values 7 = 1.07 and 1.13. This 
suggests that the productivity parameter a is likely to be smaller than (instead of equal 
to) the b- value of the Gutenberg- Richter law (recall that 7 = b/a). There is also a smaller 
impact of e and of — m : in general, higher values of these parameters displace the pdf 
downward. 

The comparison between these 36 theoretical pdf's and Corral's data in figure 10 shows 
that there are large uncertainties in the inversion of the parameters. One could argue that 
the parameter 9 should perhaps be modified to a value different from 0.05 in order to better 
describe the data for small re's. But this region is very sensitive to errors such as resulting 
from incompleteness [19, 23] and its use is problematic. 

V. CONCLUSION 

We can conclude by the following rather conservative assessment. Recalling the definition 
of the dimensionless variable x = At, where A is the average seismicity rate of a given region 
and r is a realization of the random variable defined as the inter-event time between two 
successive events in that region, we observe on the one-hand that the range of dimension- 
less inter-event times x > 0.1 is probably quite reliable from an empirical view point but 
the corresponding pdf's are remarkably insensitive to the specific values of the clustering 
parameter, Omori law exponent, productivity exponent and completeness of the catalogs. 
On the other hand, the range of x < 0.1 which would promise to give more sensitivity is 
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not only highly unreliable but also lacks significant power to obtain a good inversion due 
to the existence of many almost equally good fits with quite different sets of parameters. 
Our theoretical analysis and its comparison with Corral's data does not seem to support the 
proposition that inter-event time distributions could provide a new and more reliable way 
to measure of the so-called background earthquake activity as suggested in Ref. [14, 15]. 
Acknowledgements: We are grateful to A. Corral for sharing his data with us. 
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Appendix A: Numerical solution for the linear approximation 

This appendix provides a validation step of the numerical discretization scheme that we 
have developed to solve the nonlinear equations (29) and (30). Here, we apply this scheme to 
the linear approximation and compare the result with those which are available analytically. 
In the linear case, the equation (30) for g(y) reduces to the following implicit linear integral 
equation 



where 5 is defined in (17). To solve (34), we use the method of successive approximations to 
obtain the value of the function g(yi) on a regular grid yi = y + i ■ dy with a small mesh dy. 
This method is adapted to the treatment of the convolution integral in the right-hand-side 
of (34). This simple method is fast and provides good convergence. For example, with 
dy ~ 10~ 4 , the calculation converges on the 15-th iteration with a residual absolute error 



This is illustrated in Fig. 11 which shows the function g(y) in the linear approximation, 
obtained directly from the numerical solution of equation (34) for dy = 10~ 5 and 10 -4 , and 
by using the equation for M_(y) for dy = 10~ 5 and 10 -4 . As mentioned in the main text, 
the convergence is faster when using g{y) compared with using M_(y). Eventually, as the 
mesh size goes to zero, both methods converge towards the same estimation. An illustration 
of this convergence is given with dy = 10 -5 , which shows much better agreement between 
the two estimations compared with the results obtained for dy = 10~ 4 . The function g(y) 
obtained by solving (34) is almost identical for dy = 10~ 5 and 10~ 4 , demonstrating the 
faster convergence of this scheme. Fig. 11 also shows that the quasi-static approximation is 
not perfect, but exhibits a relative error of about no more than 1% in this example. Once 
g(y) has been obtained, the distribution of inter-event times is obtained from equation (18), 
which can be expressed here as 



and with (21). Using the function g(y) obtained by solving (34) with dy = 10~ 5 gives the 
dimensionless pdf of inter-event times shown in Fig. 11. For comparison is also shown the 
pdf obtained by Saichev and Sornette with the quasi-static approximation [33, 34]. There 
is an excellent agreement between the two methods. 



g(y) = 5[a £ (y) + $ e (y) <g> g(y)} , 



(34) 



io- 15 . 




(35) 
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Appendix B: ETAS simulations 

This appendix describes how we construct the pdf of inter-event times in specific synthetic 
catalogs generated with the ETAS model. Actually, we do not generate synthetic catalogs. 
Instead, we use the analytical form of the cumulative distribution function (cdf) F k+1 {r) 
of the waiting time r between the fc-th and the (k + l)-th event, knowing the times and 
magnitudes of the preceding k events, to draw the occurrence time of this (k + l)-th event. 
Generating in this way 1000 or more inter-event times, we use logarithmic bins to construct 
the histogram of these inter-event times. This construction provides the "true" or "exact" 
numerical benchmark against which to compare our theory and the empirical data. 

The cdf F k+ i(r) is obtained by recurrence as follows. For the first event, Fi(t) is nothing 
but the cumulative probability of occurrence of a spontaneous (background) shock since the 
origin of time, given by definition by the Poisson law with rate u: 

F 1 (t) = 1 - e~" T . (36) 

The cdf F 2 (t) of the waiting time from the first to the second shock is made of two 
contributions: (i) the second shock may again be a background event or (ii) it may be 
triggered by the first shock. This yields 

F 2 ( T ) = 1 - e -^ e - P l(l-a(r)) ^ ( 37 ) 

where p\ = p{mi) is the productivity of the first shock obtained from expression (3) given 
its magnitude mi, and a(r) is defined in (20). 

All following shocks are similarly either a background event or triggered by one of the 
preceding events. The cdf F 3 (t) of the waiting time between the second and the third shocks 
is thus given by 

F 3 (t) = 1 — e - wr e -Ma(T2)-a(T2+T))-p2(l-a(V)) ^ j-gg-j 

where r 2 is the realized time interval between the first and the second shocks. Iterating, 
we obtain the cdf F k {r) for the waiting between the (k — l)-th and k-th shocks under the 
following form 

F fc (r) = l-e^ r exp 



k-l 

£ 



Pk- 





+ r 



(39) 
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where Tj is the waiting time between the (j — l)-th event and the j-th event, and pi = p(rrii) 
is the productivity of the i-th shock obtained from expression (3) given its magnitude m^. 

In order to generate the (k + l)-th inter-event time interval between the occurrence of 
the fc-th and (k + l)-th shock, it is necessary to know the k previous inter-events times 
between the k previous shocks and their k magnitudes. Since, in the ETAS model, the 
magnitudes are drawn independently according to the Gutenberg- Richter distribution (1), 
they can be generated once for all. In order to generate a catalog of N events, we thus 
draw iV magnitudes from the law (1). In order to generate the corresponding N inter-event 
times, we use the expression (39) iteratively from k = 1 to k = N in a standard way: since 
any cdf F(x) of a random variable x is by construction itself uniformly distributed in [0, 1], 
we obtain a given realization x* of the random variable x by drawing a random number 
r uniformly in [0, 1] and by solving the equation F(x*) = r. In our case, we generate N 
independent uniformly distributed random numbers Xi,...,Xn in [0,1] and determine each 
Ti successively as the solution of Fj(rj) = X{. 
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FIG. 1: Non-dimensional probability density function f(x) of inter-events times for rrid — = 
2,7 = 1.2, n = 0.9,$ = 0.03 (continuous line) compared with Corral's phenomenological expression 
[3] (dashed-dotted line). 
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FIG. 2: Solution g(y) of the nonlinear equation (30) (continuous line) and quasi-static approxima- 
tion (19) (dashed line). The parameters of the ETAS model are md — tuq = 2, 7 = 1.2, n = 0.9, 9 = 
0.03. 
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FIG. 3: Function g(y) obtained with different schemes: (solid) - quasi-static approximation devel- 
oped in [33, 34]; numerical solution of equation (34) for (dash-dotted) dy = 10 -5 and (long dashed) 
- 1CT 4 ; numerical of solution of (28) for M_(y) yielding g(y) = 1 — M_(y): (dotted) dy = 1CT 5 
and (dashed) 10~ 4 . The ETAS parameters are — tuq = 2,7 = 1.2, n = 0.9,6* = 0.03. 
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FIG. 4: Probability P(y) defined in (9) obtained by the numerical solution of the nonlinear equa- 
tions (29) and (30) (continuous line) compared with the quasi-static approximation reported in 
Ref. [33, 34] (dashed line). The ETAS parameters are m d - m = 2, 7 = 1.2, n = 0.9, 6 = 0.03. 
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Comparing linear and nonlinear models 




FIG. 5: Pdf of dimensionless inter-event times for the ETAS parameters equal to n = 0.9, 6 = 
0.05,7 = 1-1) m d — rriQ = {). The pdf obtained with the nonlinear theory leading to the equations 
(29) and (30) is shown with the solid line. The pdf obtained with the linear theory is shown as the 
dashed line. The "exact" pdf obtained by the simulation method described in Appendix B is shown 
in histogram form (circles) and with a smoothing fit (dashed-dotted) performed with a function 
defined as the ratio of a polynomial function of 5-th order over another polynomial function of 4-th 
order, in terms of the logarithm of the dimensionless inter-event times. The dotted line shows the 
pdf obtained with the linear theory with effective parameters n c g = 0.86 and 7 e ff = 1.28, chosen 
to fit the "exact" pdf in the region x > 4 • 10~ 2 where the nonlinear theory is performing well. 
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FIG. 6: Test showing that using the exact values of the incomplete gamma function (13) in the 
calculation of equation (29) is undistinguishable from the solution obtained with the expansion 
(14) which includes all four terms up to second-order. 




FIG. 7: Comparison between the "exact" pdf obtained from a synthetic ETAS catalog (crosses) with 
n = 0.86,0 = 0.05,7 = 1-H an d the nonlinear theory without (dashed line) and with (continuous 
line) adaptive mesh grid size as described in the text. 
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Error in estimation of key parameters with Log Least Squares method 




FIG. 8: Comparison between (i) the "exact" pdf of inter-event times (crosses) generated with the 
ETAS model according to the method described in Appendix B with n cat = 0.86, 6 cat = 0.05, 7 ca t = 
1.11 and {nid — mo} ca t = 0, which mimics a real-life situation containing fluctuations, (ii) the best 
fit (dashed line) with the nonlinear theory and (hi) the pdf obtained with the nonlinear theory 
with the true values of the parameters (continuous line). For simplicity, we impose the true values 
Q = #cat = 0.05 and ma — tuq = {nid — mo} cat = in the best fit and invert for the two other 
parameters, which yields nb cs t fit = 0.94 ± 0.02 and 7bcst fit = 1-13 ± 0.02. 
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FIG. 9: Comparison between the pdf's of inter-event times in synthetic ETAS catalogs with three 
different sets of parameters and the pdf's obtained by Corral in different regions of the world. 
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FIG. 10: Comparison between the pdf's obtained by Corral in different regions of the world and 
the 36 pdf's of inter-event times calculated with the nonlinear theory for all combinations of the 
following sets (with a fixed 6 = 0.05): n = 0.64,0.8,0.96; (ii) 7 = 1.01, 1.07, 1.13; (hi) m d - m = 
0.1,1; (iv) e(c) = 10~ 4 ,10~ 5 . Solid lines: rrid — tuq = 0.1, c = 10~ 4 ; dashed lines: m& — = 
0.1, c = 10~ 5 ; dashed-dotted lines: — mo = 1, c = 10~ 4 ; dotted lines: — tuq = 1, c = 10~ 5 . 
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FIG. 11: Probability density functions (pdf) of inter-event times obtained by using the function 
g(y) solution of (34) with dy = 10~ 5 , with (35) and (21) (continuous line). The pdf obtained 
by Saichev and Sornette with the quasi-static approximation [33, 34] is shown as the dashed 
line. Corral's fitting curve is the dotted-dashed line. The parameters of the ETAS model are 
m d - mo = 2, 7 = 1.2, n = 0.9, 9 = 0.03. 



31 



